Principal component methods - hierarchical clustering - partitional clustering: why would we need to choose for visualizing data?
Abstract
This paper combines three exploratory data analysis methods (principal component methods, hierarchical clustering, and partitioning) to enrich the description of the data. Principal component methods are used as a preprocessing step for the clustering in order to denoise the data, transform categorical data into continuous data, or balance groups of variables. The principal component representation is also used to visualize the hierarchical tree and/or the partition in a 3D map, which helps to better understand the data. The proposed methodology is available in the HCPC (Hierarchical Clustering on Principal Components) function of the FactoMineR package.
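The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the FactoMineR implementation of HCPC. The function names (`pca_scores`, `ward_cut`, `kmeans_consolidate`, `hcpc_sketch`), the toy data, the Ward criterion, the fixed number of clusters, and the use of k-means to refine the tree cut are all assumptions made for illustration.

```python
import math
import random


def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def pca_scores(rows, n_components, n_iter=200, seed=0):
    """Coordinates of the rows on the leading principal axes.

    Power iteration with deflation on the covariance matrix; adequate
    for a small illustration, not a production-grade PCA.
    """
    rng = random.Random(seed)
    p = len(rows[0])
    mean = centroid(rows)
    X = [[v - m for v, m in zip(r, mean)] for r in rows]
    cov = [[sum(x[a] * x[b] for x in X) / len(X) for b in range(p)]
           for a in range(p)]
    axes = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(p)]
        n0 = math.sqrt(sum(t * t for t in v))
        v = [t / n0 for t in v]
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(t * t for t in w))
            if norm < 1e-12:  # no variance left to extract
                break
            v = [t / norm for t in w]
        lam = sum(v[a] * cov[a][b] * v[b] for a in range(p) for b in range(p))
        axes.append(v)
        # Deflate: remove the variance explained by the axis just found.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(p)]
               for a in range(p)]
    return [[sum(xj * aj for xj, aj in zip(x, ax)) for ax in axes] for x in X]


def ward_cut(points, k):
    """Naive agglomerative clustering with Ward's criterion, cut at k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([points[i] for i in clusters[a]])
                cb = centroid([points[i] for i in clusters[b]])
                na, nb = len(clusters[a]), len(clusters[b])
                cost = na * nb / (na + nb) * sqdist(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def kmeans_consolidate(points, labels, k, n_iter=10):
    """Refine a partition with k-means started from its own centroids."""
    centers = [centroid([p for p, l in zip(points, labels) if l == c])
               for c in range(k)]
    for _ in range(n_iter):
        new = [min(range(k), key=lambda c: sqdist(p, centers[c]))
               for p in points]
        if new == labels:
            break
        labels = new
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = centroid(members)
    return labels


def hcpc_sketch(rows, n_components, k):
    """PCA preprocessing, then Ward tree cut, then k-means refinement."""
    scores = pca_scores(rows, n_components)
    return kmeans_consolidate(scores, ward_cut(scores, k), k)


def demo_data(seed=1):
    """Hypothetical toy data: two noisy groups of 10 points in 3D."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
    return [[c + rng.gauss(0, 0.3) for c in center]
            for center in centers for _ in range(10)]


if __name__ == "__main__":
    print(hcpc_sketch(demo_data(), n_components=1, k=2))
```

Keeping only the first principal component before clustering plays the denoising role mentioned in the abstract: the third coordinate of the toy data is pure noise and is largely discarded by the projection.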
Similar resources
Automatic data clustering using an improved imperialist competitive algorithm
The Imperialist Competitive Algorithm (ICA) is considered a prime meta-heuristic algorithm for finding the general optimal solution in optimization problems. This paper presents an application of ICA to automatic clustering of huge unlabeled data sets. By using a proper structure for each of the chromosomes and the ICA, at run time, the suggested method (ACICA) finds the optimum number of clusters while optim...
Computation of Initial Modes for K-modes Clustering Algorithm Using Evidence Accumulation
The accuracy of partitional clustering algorithms for categorical data depends primarily on the choice of initial data points used to start the clustering process, and hence the clustering results cannot be generated and repeated consistently. In this paper we present an approach to compute initial modes for the K-modes partitional clustering algorithm to cluster categorical data sets. Here we u...
A Particle Swarm Optimization based fuzzy c means approach for efficient web document clustering
Large sets of documents need to be organized into categories through clustering so that searching for and finding relevant information on the web, with its large number of documents, becomes easier and quicker. Hence we need more efficient clustering algorithms for organizing documents. Clustering of large text datasets can be done effectively using partitional clustering algorithm...
Constraint Based Partitional Clustering - A Comprehensive Study and Analysis Aparna
Data clustering forms a predefined number of clusters such that the data points within each cluster are very similar to each other and the data points in different clusters are dissimilar. Clustering is widely used in domains such as bioinformatics, medical data, imaging, marketing studies and crime analysis. The popular types of clustering techniques ar...